Should We Abandon the t-Test in the Analysis of Gene Expression Microarray Data: A Comparison of Variance Modeling Strategies

Authors

  • Marine Jeanmougin
  • Aurelien de Reynies
  • Laetitia Marisa
  • Caroline Paccard
  • Gregory Nuel
  • Mickael Guedj
Abstract

High-throughput post-genomic studies are now routine and promising in biological and biomedical research. The main statistical approach for selecting genes differentially expressed between two groups is the t-test, which is subject to criticism in the literature. Numerous alternatives have been developed, based on different and innovative variance modeling strategies. A critical issue, however, is that selecting a different test usually leads to a different gene list. In this context, and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To help answer this question, we compare eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison process relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to formulate comprehensive and robust conclusions about test performance in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. In addition, two tests (limma and VarMixt) show significant improvement over the t-test, in particular when dealing with small sample sizes. Moreover, limma presents several practical advantages, so we advocate its application to the analysis of gene expression data.
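
As a rough illustration of the baseline that the alternatives are compared against, the following Python sketch computes Welch's t statistic and its Welch-Satterthwaite degrees of freedom for one gene measured in two groups. The function name `welch_t` and the toy expression values are hypothetical and are not taken from the paper; this is a minimal sketch, not the authors' implementation.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance t-test on two samples.

    x, y: expression values of one gene in each group (at least 2 values each).
    """
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)  # unbiased sample variances
    se2 = vx / nx + vy / ny            # squared standard error of the mean difference
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


# Hypothetical log-expression values for a single gene, 3 arrays per group
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 4.0, 6.0]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.3f}")
```

With only three arrays per group, each gene's variance is estimated from very few values, which is the small-sample setting in which the abstract reports that limma and VarMixt improve on the t-test.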

Similar resources

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data give access to a wealth of information on thousands of variables (genes). Analyzing this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method, selecting high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement on the k-means clustering method. It is an incremental clustering method that starts with one cluster. Iteratively ...

Global gene expression analysis using microarray to study differential vulnerability to neurodegeneration

Neurodegenerative disorders such as Parkinson’s disease, motor neuron disease and Alzheimer’s disease are characterized by the loss of specific cells within certain regions of the brain. One of the most compelling questions is why specific cell populations are vulnerable to neurodegeneration. We addressed this question by studying global gene expression changes using an animal model of ...

Analysis of Gene Expression, Signaling Pathways, and Interaction Networks of Some Effective Genes in Patients with Asthma in Microarray Studies Using R Software

Background and purpose: Asthma is a chronic inflammatory disorder of the airways caused by a combination of complex environmental and genetic interactions. The mechanisms that affect both the severity of the disease and how it responds to treatment are incompletely understood. Differences in gene expression have been reported between patients with asthma and healthy controls. Materials and methods:...

Journal title:

Volume 5, issue

Pages -

Publication date: 2010